We study the problem of finding a universal estimation scheme $h_n:\mathbb{R}^n\to\mathbb{R}$, $n=1,2,\ldots$, which will satisfy

\[
\lim_{t\to\infty}\frac{1}{t}\sum_{i=1}^{t}\bigl|h_i(X_0,X_1,\ldots,X_{i-1})-E(X_i\mid X_0,X_1,\ldots,X_{i-1})\bigr|^p=0 \quad\text{a.s.}
\]

for all real-valued stationary and ergodic processes that are in $L^p$. We will construct a single such scheme for all $1<p<\infty$, and show that for $p=1$ mere integrability does not suffice but $L\log^+L$ does.

1. Introduction. The problem of sequentially predicting the next value $X_n$ of a stationary process after observing the initial values $X_i$ for $0\le i<n$ is one of the central problems in probability and statistics. Usually, one bases the prediction on the conditional expectation $E(X_n\mid X_0^{n-1})$, where we write for brevity $X_0^{n-1}=\{X_0,X_1,\ldots,X_{n-1}\}$. However, when one does not know the distribution of the process, one is faced with the problem of estimating the conditional expectation from a single sample of length $n$. It was shown long ago by Bailey [5] (cf. also Ryabko [30] and Györfi, Morvai and Yakowitz [10]) that even for binary processes no universal scheme $h_n(X_0^{n-1})$ exists which will almost surely satisfy $\lim_{n\to\infty}(h_n(X_0^{n-1})-E(X_n\mid X_0^{n-1}))=0$. This is in contrast to the backward estimation problem, where one is trying to estimate $E(X_0\mid X_{-\infty}^{-1})$ based on the successive observations of $X_{-n}^{-1}$. Here, it was Ornstein [29] who constructed the first such universal estimator for finite valued processes. This was generalized to bounded processes by Algoet [1],
Morvai [16] and Morvai, Yakowitz and Györfi [18]. For unbounded processes, several universal estimators were constructed (see Algoet [3] and Györfi et al. [9]).

Returning to our original problem of sequential prediction, it was already observed by Bailey that backward schemes could be used successfully for the sequential prediction problem, in the sense that the error tends to zero in the Cesàro mean. To establish this, he applied a generalized ergodic theorem which requires some technical hypotheses that were satisfied in his case.

Over the years some authors have extended this work of adapting backward schemes to sequential prediction, but only for bounded processes (see Algoet [1, 3], Morvai [16], Morvai, Yakowitz and Györfi [18] and Györfi et al. [9]).

Another approach to sequential prediction used a weighted average of expert schemes, and with these the results were extended to the general unbounded case by Nobel [28] and Györfi and Ottucsák [12] (see also the survey of Feder and Merhav [8]). However, none of these results were optimal, in the sense that moment conditions higher than necessary were assumed. It is our purpose to obtain these optimal conditions and to show why they are necessary. We consider the following problem for $1\le p<\infty$. Does there exist a scheme $h_n(X_0^{n-1})$ which will satisfy

\[
\lim_{t\to\infty}\frac{1}{t}\sum_{i=1}^{t}\bigl|h_i(X_0^{i-1})-E(X_i\mid X_0^{i-1})\bigr|^p=0 \quad\text{a.s.}
\]

for all real-valued stationary and ergodic processes that are in $L^p$? The only case that has been solved completely is when $p$ is infinity. Even the recent schemes of Nobel [28] and Györfi and Ottucsák [12] put a higher moment condition on the process than is manifestly required. Our main result is that the basic scheme first introduced by the first author in his thesis can be adapted to give a scheme which will answer our problem positively for all $1<p$. For $p=1$, we shall show that a stronger hypothesis is necessary, as is usually the case, and we will establish the convergence under the hypothesis that $X_0\in L\log^+L$.

In the third section, we will show that this hypothesis cannot be weakened to $X_0\in L^1$. Our construction will be based on one of the simplest ergodic transformations, the adding machine, and illustrates the richness of behavior that is possible for processes that are almost periodic (in the sense of Besicovitch).

As soon as one knows that the errors converge to zero in Cesàro mean, it follows that there is a set of density one of time moments along which the errors converge to zero. However, in general one does not know what this sequence is. In the framework of estimation schemes adapted to a sequence of stopping times (see [19-21, 23-26]), one may ask whether one can find a sequence of stopping times with density one along which the errors of a universal sequential prediction scheme will tend to zero. We have been unable to do this in general and regard it as an important open problem. Finally, we refer the interested reader to some other papers which are relevant to this line of research [2, 11, 17, 27, 34].

Some technical probabilistic results have been relegated to the Appendix; they are of a classical nature and may be known, but we were unable to find references.



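2. The main result. Let $X=\{X_n\}_{n=-\infty}^{\infty}$ denote a real-valued doubly infinite stationary ergodic time series. Let

\[
X_i^j=(X_i,X_{i+1},\ldots,X_j)
\]

be notation for a data segment, where $i$ may be minus infinity. Let

\[
X^-=X_{-\infty}^{-1}.
\]

Let $G_k$ denote the quantizer

\[
G_k(x)=\begin{cases}
0, & \text{if } -2^{-k}<x<2^{-k},\\
-i2^{-k}, & \text{if } -(i+1)2^{-k}<x\le -i2^{-k} \text{ for some } i=1,2,\ldots,\\
i2^{-k}, & \text{if } i2^{-k}\le x<(i+1)2^{-k} \text{ for some } i=1,2,\ldots.
\end{cases}
\]

Define the sequences $\lambda_{k-1}$ and $\tau_k$ recursively ($k=1,2,\ldots$). Put $\lambda_0=1$ and let $\tau_k$ be the time between the occurrence of the pattern

\[
B(k)=(G_k(X_{-\lambda_{k-1}}),\ldots,G_k(X_{-1}))=G_k(X_{-\lambda_{k-1}}^{-1})
\]

at time $-1$ and the last occurrence of the same pattern prior to time $-1$. More precisely, let

\[
\tau_k=\min\{t>0: G_k(X_{-\lambda_{k-1}-t}^{-1-t})=G_k(X_{-\lambda_{k-1}}^{-1})\}.
\]

Put

\[
\lambda_k=\tau_k+\lambda_{k-1}.
\]

Define

\[
R_k=\frac{1}{k}\sum_{1\le j\le k}X_{-\tau_j}. \tag{2.1}
\]

To obtain a fixed sample size $t>0$ version, let $\kappa_t$ be the maximum of integers $k$ for which $\lambda_k\le t$. For $t>0$, put

\[
\hat R_{-t}=\frac{1}{\kappa_t}\sum_{1\le j\le\kappa_t}X_{-\tau_j}. \tag{2.2}
\]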
Motivated by Bailey [5], for $t>0$ consider the estimator

\[
\hat R_t(\omega)=\hat R_{-t}(T^t\omega),
\]

which is defined in terms of $(X_0,\ldots,X_{t-1})$ in the same way as $\hat R_{-t}(\omega)$ was defined in terms of $(X_{-t},\ldots,X_{-1})$. ($T$ denotes the left shift operator.) The estimator $\hat R_t$ may be viewed as an online predictor of $X_t$. This predictor has special significance not only because of potential applications, but additionally because Bailey [5] proved that it is impossible to construct estimators $\hat R_t$ such that always $\hat R_t-E(X_t\mid X_0^{t-1})\to0$ almost surely.
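As an illustration only, the recursive definition of the recurrence times, the pattern lengths and the fixed sample size average (2.2) can be sketched in Python. The function names `quantize`, `backward_estimate` and `forward_predict`, the list layout `past = [X_{-t}, ..., X_{-1}]`, and the convention of returning `None` when no pattern recurs inside the sample (a case the text does not address) are assumptions of this sketch, not part of the paper.

```python
import math


def quantize(x, k):
    """Quantizer G_k: move x toward zero onto the grid of multiples of 2**-k."""
    scale = 2 ** k
    return math.trunc(x * scale) / scale


def backward_estimate(past):
    """Sketch of the average (2.2) computed from past = [X_{-t}, ..., X_{-1}].

    Returns (estimate, kappa), where kappa counts the recurrence times
    tau_1, ..., tau_kappa that fit inside the sample. Returning None for the
    estimate when kappa == 0 is an assumption of this sketch.
    """
    t = len(past)
    lam = 1          # lambda_0 = 1
    total = 0.0
    kappa = 0
    while True:
        k = kappa + 1
        q = [quantize(x, k) for x in past]
        pattern = q[t - lam:]                 # G_k(X_{-lam}), ..., G_k(X_{-1})
        tau = None
        for shift in range(1, t - lam + 1):   # candidate recurrence times
            if q[t - lam - shift:t - shift] == pattern:
                tau = shift
                break
        if tau is None:                       # lambda_k would exceed t
            break
        total += past[t - tau]                # X_{-tau_k}
        lam += tau                            # lambda_k = tau_k + lambda_{k-1}
        kappa = k
    return (total / kappa if kappa else None), kappa


def forward_predict(xs):
    """Predictor of X_t from xs = [X_0, ..., X_{t-1}], reusing the backward rule."""
    return backward_estimate(xs)[0]
```

On the alternating sample `[0, 1] * 8`, for instance, the sketch finds seven nested recurrences and returns 0.0, the value that followed each earlier occurrence of the pattern.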

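Theorem 1. Let $\{X_n\}$ be stationary and ergodic. Assume that

\[
E(|X_0|\log^+(|X_0|))<\infty.
\]

Then

\[
\lim_{t\to\infty}\frac{1}{t}\sum_{i=1}^{t}\bigl|\hat R_i-E(X_i\mid X_0^{i-1})\bigr|=0 \quad\text{a.s.} \tag{2.3}
\]

and

\[
\lim_{t\to\infty}\frac{1}{t}\sum_{i=1}^{t}|\hat R_i-X_i|=E\bigl(|E(X_0\mid X_{-\infty}^{-1})-X_0|\bigr) \quad\text{a.s.} \tag{2.4}
\]

Furthermore, if for some $1<p<\infty$, $E(|X_0|^p)<\infty$, then

\[
\lim_{t\to\infty}\frac{1}{t}\sum_{i=1}^{t}\bigl|\hat R_i-E(X_i\mid X_0^{i-1})\bigr|^p=0 \quad\text{a.s.} \tag{2.5}
\]

and

\[
\lim_{t\to\infty}\frac{1}{t}\sum_{i=1}^{t}|\hat R_i-X_i|^p=E\bigl(|E(X_0\mid X_{-\infty}^{-1})-X_0|^p\bigr) \quad\text{a.s.} \tag{2.6}
\]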
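To make the quantities in (2.3)-(2.6) concrete, the following sketch computes the Cesàro average of the $p$-th power errors for given predictions and targets. The function name `cesaro_error` and its argument layout are illustrative assumptions, not notation from the text.

```python
def cesaro_error(predictions, targets, p=1.0):
    """Return (1/t) * sum_i |predictions[i] - targets[i]|**p."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(r - x) ** p for r, x in zip(predictions, targets))
    return total / len(targets)
```

With `p=1.0` this is the average absolute error appearing in (2.3) and (2.4); other values of `p` give the averages in (2.5) and (2.6).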

Proof. The proof will follow the same pattern in all four cases. We will 
verify that the backward estimator scheme converges almost surely and we 
will see that the sequence of errors is dominated by an integrable function. 
This allows us to conclude from the generalized ergodic theorem of Maker 
(rediscovered by Breiman, cf. Theorem 1 in Maker [15] or Theorem 12 in 
Algoet [2]) that the forward scheme converges in Cesaro mean. For the first 
case, we will carry this out in full detail, for the others we will just check 
the requisite properties for the backward scheme. First, consider 

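\[
\begin{aligned}
R_k &= \frac{1}{k}\sum_{1\le j\le k}\bigl[X_{-\tau_j}-G_j(X_{-\tau_j})\bigr]\\
&\quad+\frac{1}{k}\sum_{1\le j\le k}\bigl[G_j(X_{-\tau_j})-E(G_j(X_{-\tau_j})\mid G_{j-1}(X_{-\lambda_{j-1}}^{-1}))\bigr]\\
&\quad+\frac{1}{k}\sum_{1\le j\le k}\bigl[E(G_j(X_{-\tau_j})\mid G_{j-1}(X_{-\lambda_{j-1}}^{-1}))-E(X_{-\tau_j}\mid G_{j-1}(X_{-\lambda_{j-1}}^{-1}))\bigr]\\
&\quad+\frac{1}{k}\sum_{1\le j\le k}\bigl[E(X_{-\tau_j}\mid G_{j-1}(X_{-\lambda_{j-1}}^{-1}))-E(X_0\mid G_{j-1}(X_{-\lambda_{j-1}}^{-1}))\bigr]\\
&\quad+\frac{1}{k}\sum_{1\le j\le k}E(X_0\mid G_{j-1}(X_{-\lambda_{j-1}}^{-1}))\\
&= A_k+B_k+C_k+D_k+E_k.
\end{aligned}
\]

Obviously,

\[
|A_k|+|C_k|\le\frac{2}{k}\sum_{1\le i\le k}2^{-i}\le\frac{2}{k}.
\]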

Now we will deal with $D_k$. By Lemma 1 in Morvai, Yakowitz and Györfi [18],

\[
P(X_{-\tau_j}\in C\mid G_{j-1}(X_{-\lambda_{j-1}}^{-1}))=P(X_0\in C\mid G_{j-1}(X_{-\lambda_{j-1}}^{-1})).
\]

Using this, we get that $D_k=0$.

Assume that $E(|X_0|\log^+(|X_0|))<\infty$. Toward mastering $B_k$, one observes that $\{X_{-\tau_j}\}$ are identically distributed by Lemma 1 in Morvai, Yakowitz and Györfi [18] and $B_k$ is an average of martingale differences. By Proposition 1 in the Appendix, $B_k\to0$ almost surely and $E(\sup_{1\le k}|B_k|)<\infty$.

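Now we deal with the last term $E_k$. By assumption,

\[
\sigma(G_j(X_{-\lambda_j}^{-1}))\uparrow\sigma(X^-).
\]

Consequently, by the a.s. martingale convergence theorem, we have that

\[
E(X_0\mid G_j(X_{-\lambda_j}^{-1}))\to E(X_0\mid X^-) \quad\text{a.s.,}
\]

and thus, since $E_k$ is a Cesàro average of these conditional expectations,

\[
E_k\to E(X_0\mid X^-) \quad\text{a.s.}
\]

Furthermore, by Doob's inequality (cf. Theorem 1 on page 464, Section 3, Chapter VII in Shiryayev [32]), $E(\sup_{1\le k}|E_k|)\le E(\sup_{1\le j}|E(X_0\mid G_j(X_{-\lambda_j}^{-1}))|)<\infty$.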

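We have so far proved that

\[
R_k\to E(X_0\mid X^-) \quad\text{almost surely}
\]

and

\[
E\Bigl(\sup_{1\le k}|R_k|\Bigr)<\infty.
\]

This in turn implies that

\[
\lim_{t\to\infty}\hat R_{-t}=E(X_0\mid X^-) \quad\text{almost surely}
\]

and

\[
E\Bigl(\sup_{1\le t}|\hat R_{-t}|\Bigr)<\infty.
\]

Now since $E(X_0\mid X_{-t}^{-1})\to E(X_0\mid X^-)$ almost surely,

\[
\lim_{t\to\infty}|\hat R_{-t}-E(X_0\mid X_{-t}^{-1})|=0 \quad\text{almost surely}
\]

and by Doob's inequality,

\[
E\Bigl(\sup_{1\le t}|\hat R_{-t}-E(X_0\mid X_{-t}^{-1})|\Bigr)\le E\Bigl(\sup_{1\le t}|\hat R_{-t}|\Bigr)+E\Bigl(\sup_{1\le t}|E(X_0\mid X_{-t}^{-1})|\Bigr)<\infty.
\]

Now, apply the generalized ergodic theorem to conclude that

\[
\lim_{t\to\infty}\frac{1}{t}\sum_{i=1}^{t}\bigl(|\hat R_{-i}-E(X_0\mid X_{-i}^{-1})|\bigr)(T^i\omega)=\lim_{t\to\infty}\frac{1}{t}\sum_{i=1}^{t}|\hat R_i-E(X_i\mid X_0^{i-1})|=0 \quad\text{a.s.}
\]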

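and the proof of (2.3) is complete. Similarly,

\[
\lim_{t\to\infty}|\hat R_{-t}-X_0|=|E(X_0\mid X_{-\infty}^{-1})-X_0| \quad\text{almost surely}
\]

and

\[
E\Bigl(\sup_{1\le t}|\hat R_{-t}-X_0|\Bigr)\le E\Bigl(\sup_{1\le t}|\hat R_{-t}|\Bigr)+E(|X_0|)<\infty
\]

and the generalized ergodic theorem gives

\[
\lim_{t\to\infty}\frac{1}{t}\sum_{i=1}^{t}\bigl(|\hat R_{-i}-X_0|\bigr)(T^i\omega)=\lim_{t\to\infty}\frac{1}{t}\sum_{i=1}^{t}|\hat R_i-X_i|=E\bigl(|E(X_0\mid X_{-\infty}^{-1})-X_0|\bigr) \quad\text{a.s.}
\]

and the proof of (2.4) is complete.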
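For the reader's convenience, the form of the generalized ergodic theorem used in these steps can be summarized as follows. This is a paraphrase of the standard statement, and the symbols $f_t$ and $f$ are placeholders introduced here rather than notation from the text.

```latex
% Generalized ergodic theorem (Maker; Breiman), as used in the proof.
% Let T be measure preserving and ergodic, and let f_t \to f almost surely
% with E(\sup_t |f_t|) < \infty. Then
\[
  \lim_{t\to\infty} \frac{1}{t} \sum_{i=1}^{t} f_i(T^i\omega) = E(f)
  \quad \text{almost surely.}
\]
% For (2.3): f_t = |\hat R_{-t} - E(X_0 \mid X_{-t}^{-1})|, so f = 0.
% For (2.4): f_t = |\hat R_{-t} - X_0|, so f = |E(X_0 \mid X_{-\infty}^{-1}) - X_0|.
```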
Now, we assume that for some $1<p<\infty$, $E(|X_0|^p)<\infty$, and we prove (2.5). Observe that, since $D_k=0$,

\[
|R_k|^p\le3^p\bigl(|A_k+C_k|^p+|B_k|^p+|E_k|^p\bigr).
\]

By Proposition 2 in the Appendix, $B_k\to0$ almost surely and $E(\sup_{1\le k}|B_k|^p)<\infty$; by Doob's inequality, $E(\sup_{1\le k}|E_k|^p)<\infty$; and $E_k\to E(X_0\mid X^-)$ almost surely (for the same reason as before).
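We have so far proved that

\[
R_k\to E(X_0\mid X^-) \quad\text{almost surely}
\]

and

\[
E\Bigl(\sup_{1\le k}|R_k|^p\Bigr)<\infty.
\]

This in turn implies that

\[
\lim_{t\to\infty}\hat R_{-t}=E(X_0\mid X^-) \quad\text{almost surely}
\]

and

\[
E\Bigl(\sup_{1\le t}|\hat R_{-t}|^p\Bigr)<\infty.
\]

Now since $E(X_0\mid X_{-t}^{-1})\to E(X_0\mid X^-)$ almost surely,

\[
\lim_{t\to\infty}|\hat R_{-t}-E(X_0\mid X_{-t}^{-1})|^p=0 \quad\text{almost surely}
\]

and by Doob's inequality,

\[
E\Bigl(\sup_{1\le t}|\hat R_{-t}-E(X_0\mid X_{-t}^{-1})|^p\Bigr)\le2^pE\Bigl(\sup_{1\le t}|\hat R_{-t}|^p\Bigr)+2^pE\Bigl(\sup_{1\le t}|E(X_0\mid X_{-t}^{-1})|^p\Bigr)<\infty.
\]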

By Maker's (or Breiman's) generalized ergodic theorem (cf. Theorem 1 in Maker [15] or Theorem 12 in Algoet [2]) one gets (2.5). Similarly,

\[
\lim_{t\to\infty}|\hat R_{-t}-X_0|^p=|E(X_0\mid X_{-\infty}^{-1})-X_0|^p \quad\text{almost surely}
\]

and

\[
E\Bigl(\sup_{1\le t}|\hat R_{-t}-X_0|^p\Bigr)\le2^pE\Bigl(\sup_{1\le t}|\hat R_{-t}|^p\Bigr)+2^pE(|X_0|^p)<\infty.
\]

Now, apply Maker's (or Breiman's) generalized ergodic theorem to prove (2.6). The proof of Theorem 1 is complete. □
Remark 1. We are indebted to the referee for the following remark. Using the notion of Bochner integrability of strongly measurable functions with values in $c_0$ and the extension of Birkhoff's ergodic theorem to Banach space valued functions (see Krengel [14], page 167), one can give an easy proof of Maker's theorem. The key condition now becomes the fact that the norm of the sequence $\{f-f_k\}$ in $c_0$ is integrable, and then the convergence in the norm of $c_0$ allows one to deduce the convergence of the diagonal sequence, which is what appears in Maker's theorem.

3. Integrability alone is not enough. In Theorem 1, for the Cesàro convergence in the $L^1$ norm, we assumed that $X_0$ was not merely in $L^1$ but in $L\log^+L$. In this section, we shall show that some additional condition is really necessary. We will first give an example to show that the maximal function of the conditional expectations $\sup_{1\le n}|E(X_0\mid X_{-n}^{-1})|$ may be nonintegrable for an integrable process. We shall do so in an indirect fashion by showing that the error of the estimate $E(X_n\mid X_0^{n-1})$ for $E(X_n\mid X_{-\infty}^{n-1})$ does not converge in Cesàro mean to zero. This means that even though we may be in the distant future, the information of the prehistory can make a serious difference. This example serves as a model for the main result of the section, where we show that for any estimation scheme for $E(X_n\mid X_0^{n-1})$ which converges almost surely in Cesàro mean for all bounded processes there will be some ergodic integrable process where it fails to converge. Indeed, the processes that we need to consider are countably valued and in fact are zero entropy and finitarily Markovian (see below for a definition), a generalization of finite order Markov chains.

First, let us fix the notation. Let $\{X_n\}_{n=-\infty}^{\infty}$ be a stationary and ergodic time series taking values from a finite or countable alphabet $\mathcal{X}$. (Note that all stationary time series $\{X_n\}_{n=0}^{\infty}$ can be thought to be a two sided time series, that is, $\{X_n\}_{n=-\infty}^{\infty}$.)

Definition 1. The stationary time series $\{X_n\}$ is said to be finitarily Markovian if almost surely the sequence of the conditional distributions $\mathcal{L}(X_1\mid X_{-k}^{0})$ is constant for large $k$ (it is random how large $k$ should be).

This class includes of course all finite order Markov chains but also many 
other processes such as the finitarily determined processes of Kalikow, Katznel- 
son and Weiss [13], which serve to represent all isomorphism classes of zero 
entropy processes. 

For some concrete examples that are not Markovian, consider the follow- 
ing example. 

Example 1. Let $\{M_n\}$ be any stationary and ergodic first order Markov chain with finite or countably infinite state space $S$. Let $s\in S$ be an arbitrary state with $P(M_1=s)>0$. Now let $X_n=I_{\{M_n=s\}}$. By Shields [31], Chapter I.2.c.1, the binary time series $\{X_n\}$ is stationary and ergodic. It is also finitarily Markovian. Indeed, the conditional probability $P(X_1=1\mid X_{-\infty}^{0})$ does not depend on values beyond the first (going backward) occurrence of one in $X_{-\infty}^{0}$, which identifies the first (going backward) occurrence of state $s$ in the Markov chain $\{M_n\}$. The resulting time series $\{X_n\}$ is not a Markov chain of any order in general.
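The finitarily Markovian property in Example 1 says that only the stretch of the past back to the most recent 1 matters for the conditional distribution. A minimal sketch of that truncation follows; the helper name `relevant_context` and the list layout of the past are hypothetical choices made for illustration.

```python
def relevant_context(past):
    """Return the suffix of a binary past starting at its most recent 1.

    past = [..., X_{-1}, X_0]; returns None if no 1 has been observed yet,
    in which case the finite sample does not determine the context.
    """
    for back in range(1, len(past) + 1):
        if past[-back] == 1:
            return past[len(past) - back:]
    return None
```

The length of the returned suffix is random, which matches the remark in Definition 1 that it is random how far back one has to look.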

We note that Morvai and Weiss [22] proved that there is no classifica- 
tion rule for discriminating the class of finitarily Markovian processes from 
other ergodic processes. For more about estimation for finitarily Markovian 
processes, see Morvai and Weiss [23, 24, 26]. 

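Theorem 2. Let $\mathcal{X}=\{0,\,10^{-k},\,2^k/3^m,\ k=1,2,\ldots,\ m=1,2,\ldots\}$. There exists a stationary and ergodic finitarily Markovian time series $\{X_n\}$ taking values from $\mathcal{X}$ such that $E|X_0|<\infty$ and

\[
\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bigl|E(X_n\mid X_0^{n-1})-E(X_n\mid X_{-\infty}^{n-1})\bigr|=\infty
\]

almost surely. Therefore,

\[
E\Bigl(\sup_{1\le n}|E(X_0\mid X_{-n}^{-1})|\Bigr)=\infty.
\]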

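Proof. Let $\Omega$ be the one sided sequence space over $\{0,1\}$. Let $\omega=(\omega_1,\omega_2,\ldots)\in\Omega$. Define the transformation $T:\Omega\to\Omega$ as follows:

\[
(T\omega)_i=\begin{cases}
0, & \text{if } \omega_j=1 \text{ for all } j\le i,\\
1, & \text{if } \omega_i=0 \text{ and for all } j<i:\ \omega_j=1,\\
\omega_i, & \text{otherwise.}
\end{cases}
\]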
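The transformation $T$ is binary addition of one with the least significant digit first. The sketch below applies it to a finite prefix of $\omega$; the names `odometer_step` and `orbit`, and the truncation to finitely many coordinates, are assumptions made for illustration, since the construction itself works with infinite sequences.

```python
def odometer_step(bits):
    """Apply the adding machine T to a finite prefix (omega_1, ..., omega_n).

    Leading 1s become 0 and the first 0 becomes 1. If every bit is 1, the
    carry leaves the prefix and all bits become 0 (an artifact of truncating
    to finitely many coordinates).
    """
    out = list(bits)
    for i, b in enumerate(out):
        if b == 1:
            out[i] = 0
        else:
            out[i] = 1
            break
    return out


def orbit(bits, steps):
    """Return [omega, T omega, ..., T^steps omega] on the finite prefix."""
    points = [list(bits)]
    for _ in range(steps):
        points.append(odometer_step(points[-1]))
    return points
```

On a prefix of length $n$ the map cycles through all $2^n$ patterns before returning to its starting point, which is the finite shadow of the ergodicity of the adding machine.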

Consider the product measure $P=\prod_{i=1}^{\infty}\{1/2,1/2\}$ on $\Omega$, which is preserved by $T$. It is well known (cf. Aaronson [4], page 25) that $(\Omega,P,T)$ is an ergodic process, called the adding machine or dyadic odometer. The process will be defined by a function $f:\Omega\to\mathbb{R}$ as $X_n(\omega)=f(T^n\omega)$. Let $l_3<\cdots<l_{k-1}<l_k\to\infty$. Define $a_k=a$ and $b_k=b$ when $k=2^a+b$ where $1\le b\le2^a$. Define

\[
C_k=\{\omega:\omega_i=1 \text{ for } 1\le i<l_k,\ \omega_{l_k}=0\},
\]

clearly $P(C_k)=2^{-l_k}$. Let

\[
D_k=\{\omega:\omega_i=1 \text{ for } 1\le i<l_k-a_k,\ \omega_{l_k-a_k}=0,\ \omega_i=1 \text{ for } l_k-a_k<i\le l_k\}
\]

and

\[
E_k=\bigcup_{i=0}^{2^{l_k-a_k-1}-1}T^{-i}D_k.
\]
Notice that

\[
E_k=\{\omega:\omega_{l_k-a_k}=0,\ \omega_j=1 \text{ for all } l_k-a_k<j\le l_k\}.
\]

It is clear that if the $l_k$'s are chosen large enough so that for all $k'>k$, $l_k<l_{k'}-2a_{k'}$:

• the family $C_k,D_l$, $k,l\ge3$ consists of disjoint sets,

• the intervals $[l_k-a_k,\,l_k-1]$ are also disjoint and therefore the sets $E_k$ are independent.

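The signaling function $u$ is defined by

\[
u(\omega)=\sum_{k=3}^{\infty}10^{-l_k}I_{C_k}(\omega)
\]

and the main contributor to $f$ will be

\[
v(\omega)=\sum_{k=3}^{\infty}\frac{2^{l_k}}{3^{a_k}}I_{D_k}(\omega).
\]

Clearly,

\[
E(v)=\sum_{k=3}^{\infty}\frac{2^{l_k}}{3^{a_k}}2^{-l_k}=\sum_{k=3}^{\infty}3^{-a_k}=\sum_{a=1}^{\infty}\sum_{b=1}^{2^a}3^{-a}=\sum_{a=1}^{\infty}\Bigl(\frac{2}{3}\Bigr)^a<\infty.
\]

Define a process by $f(\omega)=u(\omega)+v(\omega)$ and

\[
X_n(\omega)=f(T^n\omega).
\]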

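Notice that $X_n\in\{0,\,10^{-l_k},\,2^{l_k}/3^{a_k},\ k=3,4,\ldots\}$. Observe that $P(E_k)=2^{-a_k-1}$ and

\[
\sum_{k=3}^{\infty}P(E_k)=\sum_{a=1}^{\infty}\sum_{b=1}^{2^a}2^{-a-1}=\sum_{a=1}^{\infty}\frac{1}{2}=\infty.
\]

By the Borel-Cantelli lemma (the sets $E_k$ being independent), almost every point $\omega$ belongs to $E_k$ for infinitely many $k$. When $\omega\in E_k$,

\[
T^{i_0}\omega\in D_k \quad\text{for some } 0\le i_0\le2^{l_k-a_k-1}-1.
\]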
For u € Ek, we know that X^,{^) = lO"''- ^t time io + 2''="'''=-^ - 1, 

fO, ifj = l, 

\oJj, otherwise. 
Let's compute for a fixed io such that T^'^uj e (i.e., Xi^ = 10"*"')

Take N = 2''="'^'' and consider 
1 ^ 

n=l 



For a; € T~'"Dk (i.e., = 10^^ ), we know that 
Therefore if X.^_^2'fe-"fe-i > 0) then we must have 



1, ifl<J<4.-l, 
LJo, otherwise. 



G Cfc U IJ (Cj U Dj) 

j>k 

(because if k' <k then < and C^/ , Df^/ are defined by zero values of 
with i < Ik) and 

-^l^io+2'fc-''fe-M^O J 

(2/3)«fc+i 



> 



2 • 2-'fe 

l2'.p^"^' 
2 13 



Similarly, 



_ 2'fe/3<^fc2-'fe + Y.j>k 2'V3"^ 2-'^ + IO-J'2-'^ + 

10-^-1 + E£o(2/3)'^''+' 



< 



2 • 2-'fc 



2^K 



10-^-^ + - 3 



< 4 • 2'*= - 
On the other hand, X_l^ determines exactly the value of Xi^^^2'k -"fe -i ■

There are four cases. If X■^_^_2l^-a^.-l is equal with 0, |a|-, or for some 
k<k':W-''' or That is, 



io+2'k-°-k- 










= < 


lo-'^' 












lo. 



IX' 



if T^o+2''= G Dfe/ for some k<k', 

if T^o+2''=""'="'a; G Ck' for some A: < k', 
if otherwise. 



Now 



> < 



0.52^'= 
10" 



2"-k 

3^' 



io+2'fe-''fe-i-lN, 
00 / 1 



0.52''= 



if T*<'+2'*""* a;GCfc, 

if r*"+2'fc-°fc-i^ ^ g^^g ^ ^ 

if T^o+^'^'^^'^a; G Cfc. for some k < k', 
if otherwise. 



where we assumed that $l_{k'}-2a_{k'}>l_k$ if $k'>k$. Now



io+2'fe""fc 



> -2''' I - 



'k)+2'-k-"-k- 



) - E{X^^^2l^^^a^-l\X 



«0+2'fc"°fc"-'-ll 
00 ) 



Therefore, 



n=l ^ ^ 



_2'fc - 



_ 1 /4y'= 

Since $\limsup_{k\to\infty}(4/3)^{a_k}=\infty$, the proof of Theorem 2 will be complete as soon as we verify that the process is ergodic and finitarily Markovian. The first property follows from the fact that $T$ is an ergodic transformation. To see the second, what we need to do is to show that the values of $f(T^{-n}\omega)$ will reveal to us more and more of the values of $\omega_m$ as $n$ increases. Almost every
point is in infintely many T^^ ^ -^j's- For any such j, there is a unique 
i < 2'j ~"j such that T*~^ ^ ^ cj G Dj and this is revealed to us by the value 
of / at the point in the negative orbit of uj. This information will give us 
the values of LOm for all m up to Ij — aj and this completes the proof. □ 

Remark 2. The referee pointed out that a simpler and equivalent formulation of the first statement of the theorem above is as follows.

Let $\mathcal{X}=\{0,\,10^{-k},\,2^k/3^m,\ k=1,2,\ldots,\ m=1,2,\ldots\}$. There exists a stationary and ergodic finitarily Markovian time series $\{X_n\}$ taking values from $\mathcal{X}$ such that $E|X_0|<\infty$ and

\[
\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}E(X_n\mid X_0^{n-1})=\infty
\]

almost surely.

[This is because $E(X_n\mid X_{-\infty}^{n-1})(\omega)=E(X_0\mid X_{-\infty}^{-1})(T^n\omega)$ and by the ergodic theorem

\[
\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}E(X_n\mid X_{-\infty}^{n-1})=E(X_0)<\infty
\]

almost surely.]

Theorem 3. Let $\mathcal{X}=\{0,\,10^{-k},\,2^k/3^m,\ k=1,2,\ldots,\ m=1,2,\ldots\}$. Suppose $h_n:\mathcal{X}^n\to\mathbb{R}$ is a scheme that for any bounded ergodic finitarily Markovian process $\{Y_n\}$ taking values from $\mathcal{X}$, almost surely satisfies

\[
\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bigl|E(Y_n\mid Y_0^{n-1})-h_n(Y_0^{n-1})\bigr|=0.
\]

Then there is an ergodic finitarily Markovian process $\{X_n\}$ taking values from $\mathcal{X}$ for which

\[
E|X_0|<\infty
\]

and

\[
\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bigl|E(X_n\mid X_0^{n-1})-h_n(X_0^{n-1})\bigr|=\infty
\]

almost surely.
Proof. I. A master process. We shall prepare a master process with many possibilities for constructing a process such as in the earlier example, with the $l_k$ chosen in a fashion that will be dictated by the estimation scheme. For
i l^j ^n, define 

q{n,j) = (n^ +i)! 

and sets 

Cq{n,j) = {uj:uji = l ioT l<i< q{n,j),UJq(^n,j) = 0}, 
clearly P(C,(„,,)) = 2-«("'J). Let 

^q{nj) = {^^ : = 1 for 1 < i < q{n, j) - j, 

^q{n,j)-j = 0,a;i = 1 for q{n,j)-j < i < q{n,j)} 

and 

29(»i.i)-i-i_i 
i=0 

Notice that 

^q{n,j) = i^'-^qin,])-] = 0,Wi = 1, for ah q{n,j) -j<i< q{n,j)} 

and it follows that the sets {-E'g(„ j), 1 < j < ra, n G N} are mutually indepen- 
dent. Letting 

oo n 
n=lj=l 

the master process is defined by Yn{i^) = u{T"u}). For later use, observe that 
the Dq(^n,j)^ are disjoint. 

We will need the following easy consequence of our assumption on the 
estimators /i„ , namely that for any bounded process defined on as in the 
theorem and for any k there is an integer and a set C with P{Hk) > 
1 - and for ah uj e Ht and m > we have: \hmiYo, . . . , yC^-^))] < ^. 

IL The construction. We shall now define a sequence l^, k = 2*^* + bk, 
1 < 6 < 2" inductively, together with functions which are bounded. As k 
tends to infinity, the Vk will converge to v and we will use u + v to get our 
desired process. We may take f 2 = to start the inductive construction. 

Assume that we have already defined ^3 < /4 < • • • < a subsequence of 
the g(n,j)'s and 

2=3 ^ ^ 
we want to define Ik and Vk- Recalling the notation k - 1 = 2°"^-^ + we
have that hk_\ = 6^ — 1 unless /c — 1 = 2", in which case a^-x = a — 1 and 

Since v^-i is bounded, the process defined by 

= A_i(T"a;) = u{T^uj) + v^-x^T^uj) 

is bounded. Now, by assumption, there is an iV^ and a set with P{Hk) > 
1 — and for all G i/^ and m > we know that 

Choose n large enough so that 2''("''^fe)-"fe > lOiVfc and we make sure that 
q{n,ak) - > lOk-i- Set 

4 = q{n,ak) 

and 

This defines a new process 

It is important to observe that if for some iq < 2''=""'^"^ we have T^^oj G Di 
then for ah 0<j<io + 2'fe~"fc"i - 1 

This is because the way Ci^, is defined, we know that T^o+'i''' gg^^j |-,g 

in C/j, which implies that earlier iterates of to cannot be there. Indeed, 

Cg(„j) C T^''' 'Di^ for ah q{7i,j) > k, 

which implies that during all the later stages of the construction the values 
of X-'^ in this range will not change. So we will have for 



k 



k=3 

and 



k=3 ^ ^ 



that 



Xj (w) = Xf {uj) for ah < j < io + 2^'=-''*-^ - 1, 



if T^^u^Di^. 

It is clear that if the 4's are chosen large enough so that for all k' > k 
• the sets {Cfc,Dfc}^3 are disjoint,

• the intervals [1^ — a^, — 1] are also disjoint and therefore the sets Ei^^ are 
independent. 

The signaling function u is bounded and the main contributor to / will be 

fc=3 



Clearly, 

oo 2" 



< oo. 



k=3 k=3 a=l 6=1 a=l ^ ^ 

Define a process by f{ui) = u{uj) + v{(jj) and 

X„(a;) = /(r"a;). 

Note that X„ € as advertised. 

III. Checking the properties. Observe that P{Ei^) = 2""'= and 

oo oo 2" oo 

fc=3 a=l h=l a=l 

By the Borel-Cantelli lemma, a point a; belongs to Ei^, infinitely often. In 
addition, since P{Hk) >l-2-^, almost every point will belong to Hi^ for 
all sufficiently large k. Suppose then that u € Ei^ fi H^^. When lo € Ei^, 

T'"u € Dk for some < io < 2^'=-'"'-^ - 1. 

For ojeEi^, we know that Xi,,{u) = lO^'fe. At time zq + 2''=""'=-^ - 1, 

fo, ifi = l, 

(^.0+2'.-'".--l(^)) _ J 1^ ifl<i</,-l, 

\u!j, otherwise. 

Let's compute for a fixed iq such that T'^a; S D^^. (i.e., X^^ = 10~'*^) 
For cj G T~^"Di^ (i.e., = lO^'fc ) we know that 



1, ifl<J<^fc-l, 
(^1, otherwise. 



(r«+2'^-"^-'u;) 
Therefore if X-^^_^_2ik-'^k-^ > 0? then we must have 



m>k 1<«,1<J<2" : (j(n,j)>«fe 
because k' < k then Ik' < Ik and the Ck' , are defined by zero values of uii
with i <lk, and similarly for with q{n,j) < 1^, 

^ (2/3)^ 
- 2 • 2-'fc 

1 , ^2^'^*+^ 
2 

Similarly, 

2'fc/3'^fc2-'fc + ^^.>;.2'V3"^-2-'^- + Eg(n,j)>Zfe 10-'?("'-'')2-9("'j) + 



< 



2 • 2-'fe 



< 4 . 2'fc - 



(fe— 1) 

On the other hand, because uj S and our remark about Xj = for 
< 1 < 2'''~''fc~^ - 1, we have that 



Therefore, if we take = 2''=""'= 



10 



1 ^ 11 /o\«fc+l 



1 , /2^'^*+^ 



n=l 

— 2~'fc+"fe ^2^''" 

Since limsupfc_j.oo 0^ = 00, the proof of Theorem 3 is complete. □ 
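APPENDIX

The next result is a generalization of a result due to Elton; cf. Theorems 2 and 4 in Elton [7].

Proposition 1. For $n=0,1,2,\ldots$, let $\mathcal{F}_n$ be an increasing sequence of $\sigma$-fields, and let $X_n$, random variables measurable with respect to $\mathcal{F}_n$, be identically distributed with $E(|X_0|\log^+(|X_0|))<\infty$. Let $g_n(X_n)$ be quantizing functions so that for all $n$, $|g_n(X_n)-X_n|\le1$, and for an increasing sequence of sub $\sigma$-fields $\mathcal{G}_n\subseteq\mathcal{F}_n$ such that $g_n(X_n)=Y_n$ is measurable with respect to $\mathcal{G}_n$, form the sequence of martingale differences

\[
Z_n=g_n(X_n)-E(g_n(X_n)\mid\mathcal{G}_{n-1})=Y_n-E(Y_n\mid\mathcal{G}_{n-1}).
\]

Then

\[
E\Bigl(\sup_{1\le n}\Bigl|\frac{1}{n}\sum_{i=1}^{n}Z_i\Bigr|\Bigr)<\infty \tag{A.1}
\]

and

\[
\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}Z_i=0 \quad\text{almost surely.} \tag{A.2}
\]

Proof. We follow Elton [7], who gave the proof when the martingale differences $Z_n$ are identically distributed. Write

\[
Y_n=Y_n'+Y_n'',
\]

where $Y_n'=Y_nI_{\{|Y_n|\le n\}}$ and $Y_n''=Y_nI_{\{|Y_n|>n\}}$ (so that $|Y_n'|\le n$, and $|Y_n''|>n$ whenever $Y_n''\ne0$). Now

\[
Z_n=Y_n'-E(Y_n'\mid\mathcal{G}_{n-1})+Y_n''-E(Y_n''\mid\mathcal{G}_{n-1}).
\]

Since for any sequence of real numbers $\{a_i\}$,

\[
\sup_{1\le n}\Bigl|\frac{1}{n}\sum_{i=1}^{n}a_i\Bigr|\le2\sup_{1\le n}\Bigl|\sum_{i=1}^{n}\frac{a_i}{i}\Bigr|
\]

(cf. Lemma 7 in Elton [7]), letting

\[
d_n=Y_n'-E(Y_n'\mid\mathcal{G}_{n-1})
\]

and

\[
e_n=Y_n''-E(Y_n''\mid\mathcal{G}_{n-1})
\]

we get

\[
\begin{aligned}
E\Bigl(\sup_{1\le n}\Bigl|\frac{1}{n}\sum_{i=1}^{n}Z_i\Bigr|\Bigr) & &\text{(A.3)}\\
\le2E\Bigl(\sup_{1\le n}\Bigl|\sum_{i=1}^{n}\frac{Z_i}{i}\Bigr|\Bigr) & &\text{(A.4)}\\
\le2E\Bigl(\sup_{1\le n}\Bigl|\sum_{i=1}^{n}\frac{d_i}{i}\Bigr|\Bigr) & &\text{(A.5)}\\
{}+2E\Bigl(\sup_{1\le n}\Bigl|\sum_{i=1}^{n}\frac{e_i}{i}\Bigr|\Bigr). & &\text{(A.6)}
\end{aligned}
\]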
For (A.5) by Davis' inequality (valid for all martingale differences cf. e.g., 
Shiryayev [32], page 470), we get 



IE sup 

\ Kn 



n \ / °° 1 

E-^' <2Si^ 5;^(d 



0.5- 



vi = l 



< 2B 



/ oo 



.4=1 



0.5 



Now, E{{dif) < ^((y/)2). But since \Yi - X^] < 1, we get 

EiiY^f) = ii;((y,)'/{|K.i<.}) < + lfIm-l\<^}) 
and the Xj's are identically distributed therefore 



oo 

E72^((^^ + l)'^{|x,-i|<.}) 



i=l 



5] i?((X, + l)2/, 



{i-l<|X,-l|< 



i=l 



<KEi\Xo\), 



where X is a suitable constant (cf. the last line of the proof of Lemma 1 in 
Elton [7]). 
For (A.6),

\[
E|e_n|\le2E|Y_n''|\le2E\bigl((1+|X_n|)I_{\{|X_n|>n-1\}}\bigr)
\]

and now the $X_n$ are identically distributed. Now since $E(|X|\log^+(|X|))<\infty$, Lemma 2 in Elton [7] implies that

\[
\sum_{n=1}^{\infty}\frac{1}{n}E\bigl((1+|X_n|)I_{\{|X_n|>n-1\}}\bigr)<\infty
\]

and so

\[
2E\Bigl(\sup_{1\le n}\Bigl|\sum_{i=1}^{n}\frac{e_i}{i}\Bigr|\Bigr)\le2\sum_{i=1}^{\infty}\frac{1}{i}E|e_i|<\infty
\]

and this completes the proof of (A.1).

Now, we prove (A.2). By (A.4),

\[
U_n=\sum_{i=1}^{n}\frac{Z_i}{i}
\]

is a martingale with

\[
\sup_{1\le n}E(|U_n|)<\infty
\]

and by Doob's convergence theorem $U_n$ converges almost surely. Then by Kronecker's lemma (cf. Shiryayev [32], page 365),

\[
\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}Z_i=0
\]

almost surely. The proof of Proposition 1 is complete. □
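The step from (A.3) to (A.4) rests on an elementary inequality for real sequences, $\sup_n|n^{-1}\sum_{i\le n}a_i|\le2\sup_n|\sum_{i\le n}a_i/i|$, which follows from summation by parts. The sketch below evaluates both sides on a finite sequence so the bound can be checked numerically; the function name `kronecker_sides` is an illustrative choice, not notation from the text.

```python
def kronecker_sides(a):
    """Return (sup_n |mean of a_1..a_n|, sup_n |sum_{i<=n} a_i / i|) for a finite list."""
    partial, weighted = 0.0, 0.0
    sup_mean, sup_weighted = 0.0, 0.0
    for i, value in enumerate(a, start=1):
        partial += value            # a_1 + ... + a_i
        weighted += value / i       # a_1/1 + ... + a_i/i
        sup_mean = max(sup_mean, abs(partial) / i)
        sup_weighted = max(sup_weighted, abs(weighted))
    return sup_mean, sup_weighted
```

The same summation by parts identity is what drives Kronecker's lemma: once the weighted sums converge, the plain averages are forced to zero.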

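Proposition 2. Let $\{\phi_n,\mathcal{F}_n\}$ be a martingale difference sequence. If, for some $1<p<\infty$, $\sup_{1\le n}E(|\phi_n|^p)<\infty$, then

\[
\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\phi_i=0 \quad\text{almost surely} \tag{A.7}
\]

and

\[
E\Bigl(\sup_{1\le n}\Bigl|\frac{1}{n}\sum_{i=1}^{n}\phi_i\Bigr|^p\Bigr)<\infty. \tag{A.8}
\]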



Proof. Choose a positive integer $K$ such that $K(p-1)>1$. Define

\[
f_n=\frac{1}{n}\sum_{i=1}^{n}\phi_i.
\]

Assume first that $1<p\le2$. Now by Theorem 2 in von Bahr and Esseen [33],

\[
E(|f_n|^p)\le2\,\frac{n\sup_{1\le i}E(|\phi_i|^p)}{n^p}=2\,\frac{\sup_{1\le i}E(|\phi_i|^p)}{n^{p-1}}. \tag{A.9}
\]

Define

\[
F=\sum_{n=1}^{\infty}|f_{n^K}|^p.
\]

By (A.9), and since by assumption $\sup_{1\le n}E(|\phi_n|^p)<\infty$ and $K(p-1)>1$,

\[
E(F)\le2\sum_{n=1}^{\infty}\frac{\sup_{1\le i}E(|\phi_i|^p)}{n^{K(p-1)}}<\infty. \tag{A.10}
\]

Define

\[
g_n=\max_{1\le k<(n+1)^K-n^K}|f_{n^K+k}-f_{n^K}|^p
\]

and let

\[
G=\sum_{n=1}^{\infty}g_n.
\]

To complete the proof of (A.7) and (A.8), it is enough to show that $E(F+G)<\infty$. By (A.10), it is enough to show that $E(G)<\infty$. Now for some

\[
m=n^K+k,\qquad 1\le k<(n+1)^K-n^K,
\]



fm = (/n^+fc - /n^f ) + fn 



and 



Now 



1=1 j=l 



and so 

iw-/n^r<2^| 



n 



i=l 



+ 



n 



(n + 1)^-71^-1 



i=l 



1 "A 

j=l / 
1 

, = 1 / 



is: 



+ 



Now 

9n<2Pi 



2^' 



K K 



i=l 



+ 



.7 = 1 
Now by von Bahr and Esseen [33] and Doob's inequality (cf. e.g., Theorem 1, Chapter 3 in Shiryayev [32]),



E{9n) < 2P 



+ 



n 



2K 



p 



n 



2n^(supi?(|</.i|P; 

(n+l)^-n^ 



< 2P 



(n + 1)^ -n 



K „K\P 



n 



2K 



2n'^ sup E{\^i\P: 

■Ki 



P 



(p-l) 



n 



K 



Ki 



and the right-hand side is summable. We have completed the proof for $1<p\le2$.

Now assume $2<p<\infty$. By the theorem of Dharmadhikari, Fabian and Jogdeo [6],

\[
E(|f_n|^p)\le C(p)\,\frac{\sup_{1\le i}E(|\phi_i|^p)}{n^{p/2}}.
\]

Applying this one gets that

\[
E\Bigl(\sum_{n=1}^{\infty}|f_n|^p\Bigr)\le\sum_{n=1}^{\infty}C(p)\,\frac{\sup_{1\le i}E(|\phi_i|^p)}{n^{p/2}}<\infty.
\]

Thus,

\[
\sum_{n=1}^{\infty}|f_n|^p<\infty \quad\text{almost surely}
\]

and this yields (A.7) and (A.8). The proof of Proposition 2 is complete. □

Remark 3. The referee pointed out that the second statement of the proposition above could be proved in a simpler way as follows. By Doob's maximal inequality and Burkholder's inequality, we obtain



i?sup 



1 " 



1=1 



p\ i/p 



< 2j»max< 1 



1 



{p-iy 



.1=1 



p/2- 



1/p 



Now if p > 2, then by the triangle inequality in Lp/2, 



,i=l 



p/2- 



2/p s 1/2 



< 



.1=1 



iP 



2/p 



1/2 
If $p<2$, then since



/ \ p/2 



for all positive numbers Oj we get 




oo 1 \ 

1=1 / 



<Cpsup(i?i</.in 



p\i/p 